1.
Nat Commun ; 15(1): 4025, 2024 May 13.
Article in English | MEDLINE | ID: mdl-38740804

ABSTRACT

Intracellular membranes composing organelles of eukaryotes include membrane proteins playing crucial roles in physiological functions. However, a comprehensive understanding of the cellular responses triggered by intracellular membrane-focused oxidative stress remains elusive. Herein, we report an amphiphilic photocatalyst localised in intracellular membranes to damage membrane proteins oxidatively, resulting in non-canonical pyroptosis. The photocatalysis we developed generates hydroxyl radicals and hydrogen peroxide via water oxidation, which is accelerated under hypoxia. Single-molecule magnetic tweezers reveal that photocatalysis-induced oxidation markedly destabilises membrane protein folding. In the cellular environment, label-free quantification reveals that oxidative damage occurs primarily in membrane proteins related to protein quality control, thereby aggravating mitochondrial and endoplasmic reticulum stress and inducing lytic cell death. Notably, the photocatalysis activates non-canonical inflammasome caspases, resulting in gasdermin D cleavage to its pore-forming fragment and subsequent pyroptosis. These findings suggest that the oxidation of intracellular membrane proteins triggers non-canonical pyroptosis.


Subjects
Inflammasomes; Membrane Proteins; Oxidation-Reduction; Pyroptosis; Humans; Inflammasomes/metabolism; Membrane Proteins/metabolism; Oxidative Stress; Catalysis; Endoplasmic Reticulum Stress; Hydrogen Peroxide/metabolism; Phosphate-Binding Proteins/metabolism; Hydroxyl Radical/metabolism; Mitochondria/metabolism; Intracellular Membranes/metabolism; Intracellular Signaling Peptides and Proteins/metabolism; Mice; Animals; Photochemical Processes; Protein Folding; Caspases/metabolism; Gasdermins
2.
Anal Chem ; 95(46): 16918-16926, 2023 11 21.
Article in English | MEDLINE | ID: mdl-37946317

ABSTRACT

To gain a better understanding of the complex human immune system, it is necessary to measure and interpret numerous cellular protein expressions at the single-cell level. Mass cytometry is a relatively new technology that offers unprecedented information about the protein expression of a single cell. However, the analysis of high-dimensional and multiparametric mass cytometric data sets presents a new computational challenge. For instance, conventional "manual gating" analysis is inefficient and unreliable for multiparametric phenotyping of the heterogeneous immune cellular system; consequently, automated methods have been developed to address the high dimensionality of mass cytometry data and enhance the reproducibility of the analysis. Here, we present CyGate, a semiautomated method for classifying single cells into their respective cell types. CyGate learns a gating strategy from a reference data set, trains a model for cell classification, and then automatically analyzes additional data sets using the trained model. CyGate also supports the machine learning framework for the classification of "ungated" cells, which are typically disregarded by automated methods. CyGate's utility was demonstrated by its high performance in cell type classification and the lowest generalization error on various public data sets when compared to the state-of-the-art semiautomated methods. Notably, CyGate had the shortest execution time, allowing it to scale with a growing number of samples. CyGate is available at https://github.com/seungjinna/cygate.


Subjects
Computational Biology; Machine Learning; Humans; Flow Cytometry/methods; Reproducibility of Results; Computational Biology/methods; Algorithms
3.
Anal Chem ; 95(30): 11193-11200, 2023 08 01.
Article in English | MEDLINE | ID: mdl-37459568

ABSTRACT

Predicting peptide detectability is useful in a variety of mass spectrometry (MS)-based proteomics applications, particularly targeted proteomics. However, most machine learning-based computational methods have relied solely on information from the peptide itself, such as its amino acid sequence or physicochemical properties, despite the fact that peptides detected by MS are dependent on many factors, including protein sample preparation, digestion, separation, ionization, and precursor selection during MS experiments. DbyDeep (Detectability by Deep learning) is an innovative end-to-end LSTM network model for peptide detectability prediction that incorporates sequence contexts of peptides and their cleavage sites (by protease). Utilizing the cleavage site contexts could improve the performance of prediction, and DbyDeep outperformed existing methods in predicting peptides recognizable from multiple MS/MS data sets with diverse species and MS instruments. We argue for the necessity of a learning model that encompasses several contexts associated with peptide detection, as opposed to depending just on peptide sequences. A Python implementation of DbyDeep is available at https://github.com/BISCodeRepo/DbyDeep.


Subjects
Deep Learning; Tandem Mass Spectrometry; Peptides/chemistry; Proteins; Amino Acid Sequence
4.
Bioinformatics ; 38(11): 2980-2987, 2022 05 26.
Article in English | MEDLINE | ID: mdl-35441674

ABSTRACT

MOTIVATION: Tandem mass tag (TMT)-based tandem mass spectrometry (MS/MS) has become the method of choice for the quantification of post-translational modifications in complex mixtures. Many cancer proteogenomic studies have highlighted the importance of large-scale phosphopeptide quantification coupled with TMT labeling. Herein, we propose a predicted Spectral DataBase (pSDB) search strategy called Deephos that can improve both sensitivity and specificity in identifying MS/MS spectra of TMT-labeled phosphopeptides. RESULTS: With deep learning-based fragment ion prediction, we compiled a pSDB of TMT-labeled phosphopeptides generated from ∼8000 human phosphoproteins annotated in UniProt. Deep learning could successfully recognize the fragmentation patterns altered by both TMT labeling and phosphorylation. In addition, we discuss the decoy spectra for false discovery rate (FDR) estimation in the pSDB search. We show that FDR could be inaccurately estimated by the existing decoy spectra generation methods and propose an innovative method to generate decoy spectra for more accurate FDR estimation. The utilities of Deephos were demonstrated in multi-stage analyses (coupled with database searches) of glioblastoma, acute myeloid leukemia and breast cancer phosphoproteomes. AVAILABILITY AND IMPLEMENTATION: Deephos pSDB and the search software are available at https://github.com/seungjinna/deephos.


Subjects
Phosphopeptides; Tandem Mass Spectrometry; Humans; Phosphopeptides/analysis; Tandem Mass Spectrometry/methods; Algorithms; Databases, Factual; Software; Databases, Protein
5.
BMC Bioinformatics ; 23(1): 109, 2022 Mar 30.
Article in English | MEDLINE | ID: mdl-35354356

ABSTRACT

BACKGROUND: In shotgun proteomics, database search engines have been developed to assign peptides to tandem mass (MS/MS) spectra and at the same time post-processing (or rescoring) approaches over the search results have been proposed to increase the number of confident peptide identifications. The most popular post-processing approaches such as Percolator and PeptideProphet have improved rates of peptide identifications by combining multiple scores from database search engines while applying machine learning techniques. Existing post-processing approaches, however, are limited when dealing with results from new search engines because their features for machine learning must be optimized specifically for each search engine. RESULTS: We propose a universal post-processing tool, called TIDD, which supports confident peptide identifications regardless of the search engine adopted. TIDD can work for any (including newly developed) search engines because it calculates universal features that assess peptide-spectrum match quality while it allows additional features provided by search engines (or users) as well. Even though it relies on universal features independent of search tools, TIDD showed similar or better performance than Percolator in terms of peptide identification. TIDD identified 10.23-38.95% more PSMs than target-decoy estimation for MSFragger, which is not supported by Percolator. TIDD offers an easy-to-use simple graphical user interface for user convenience. CONCLUSIONS: TIDD successfully eliminated the requirement for an optimal feature engineering per database search tool, and thus, can be applied directly to any database search results including newly developed ones.
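
The target-decoy estimation that the abstract uses as a baseline can be sketched as follows. This is a minimal, generic illustration of target-decoy FDR and q-value computation, not TIDD's own method; the function names and the (score, is_decoy) input layout are assumptions made for this example.

```python
def target_decoy_qvalues(psms):
    """psms: list of (score, is_decoy) pairs; a higher score means a better match.

    Returns (score, is_decoy, q_value) tuples sorted by descending score.
    """
    ranked = sorted(psms, key=lambda p: p[0], reverse=True)
    fdrs = []
    targets = decoys = 0
    for _, is_decoy in ranked:
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        # Estimated FDR at this score threshold: decoys / targets.
        fdrs.append(decoys / max(targets, 1))
    # q-value: the smallest FDR at this threshold or any looser one.
    qvalues = []
    running_min = float("inf")
    for fdr in reversed(fdrs):
        running_min = min(running_min, fdr)
        qvalues.append(running_min)
    qvalues.reverse()
    return [(s, d, q) for (s, d), q in zip(ranked, qvalues)]


def accepted_targets(psms, threshold=0.01):
    """Scores of target PSMs whose q-value is at or below the threshold."""
    return [s for s, d, q in target_decoy_qvalues(psms) if not d and q <= threshold]
```

A rescoring tool in this setting would replace the raw search-engine score with a learned score before the same thresholding step is applied.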


Subjects
Algorithms; Tandem Mass Spectrometry; Databases, Protein; Machine Learning; Peptides; Tandem Mass Spectrometry/methods
6.
Int J Mol Sci ; 21(18)2020 Sep 05.
Article in English | MEDLINE | ID: mdl-32899552

ABSTRACT

β/γ-Crystallins, the main structural proteins in human lenses, have a highly stable structure that keeps the lens transparent. Mutations in these proteins have been linked to cataracts. In this study, we identified 10 new mutations of β/γ-crystallins in a lens proteomic dataset of cataract patients using bioinformatics tools. Of these, two double mutants, S175G/H181Q of βB2-crystallin and P24S/S31G of γD-crystallin, carry mutations in the largest loop linking the distant β-sheets in the Greek key motif. We selected these double mutants to characterize the properties of the mutations, employing biochemical assays, identification of protein modifications with nanoUPLC-ESI-TOF tandem MS, and examination of their structural dynamics with hydrogen/deuterium exchange-mass spectrometry (HDX-MS). We found that both double mutations decrease protein stability and induce the aggregation of β/γ-crystallin, possibly causing cataracts. This finding suggests that both double mutants can serve as biomarkers of cataracts.


Subjects
Cataract/genetics; beta-Crystallin B Chain/genetics; gamma-Crystallins/genetics; Adolescent; Adult; Aged; Child, Preschool; Humans; Infant, Newborn; Lens, Crystalline/metabolism; Mutation/genetics; Protein Aggregates/genetics; Protein Stability; Proteomics/methods; beta-Crystallin B Chain/metabolism; gamma-Crystallins/metabolism
7.
Comput Struct Biotechnol J ; 18: 1391-1402, 2020.
Article in English | MEDLINE | ID: mdl-32637038

ABSTRACT

Mass spectrometry (MS) has made enormous contributions to comprehensive protein identification and quantification in proteomics. MS is also gaining momentum for structural biology in a variety of ways, complementing conventional structural biology techniques. Here, we will review how MS-based techniques, such as hydrogen/deuterium exchange, covalent labeling, and chemical cross-linking, enable the characterization of protein structure, dynamics, and interactions, especially from a perspective of their data analyses. Structural information encoded by chemical probes in intact proteins is decoded by interpreting MS data at a peptide level, i.e., revealing conformational and dynamic changes in local regions of proteins. The structural MS data are not amenable to data analyses in traditional proteomics workflow, requiring dedicated software for each type of data. We first provide basic principles of data interpretation, including isotopic distribution and peptide sequencing. We then focus particularly on computational methods for structural MS data analyses and discuss outstanding challenges in a proteome-wide large scale analysis.

8.
J Proteome Res ; 19(1): 212-220, 2020 01 03.
Article in English | MEDLINE | ID: mdl-31714086

ABSTRACT

Recent sequencing technologies have highlighted translation of untranslated regions (UTRs) in genomes, although it remains unknown whether the translated products persist in a cell. Here, we propose a proteogenomic approach to UTR identification at the proteome level, which has been challenging due to the lack of corresponding sequences required for peptide-spectrum matching. We address the challenge by constructing a translated UTR (tUTR) database, consisting of all hypothetical sequences that can be translated from UTRs by assuming non-AUG initiation at near-cognate start codons and stop codon readthrough. In the analysis of the H1299 cell line mass spectrometry (MS/MS) dataset, the tUTR DB-based proteogenomic approach enabled the detection of 52 5'-UTR and 9 3'-UTR peptides from 45 and 9 genes, respectively. The identified UTR peptides were validated via high spectral similarity with their synthetic peptides. The 5'-UTR peptides pointed out alternative initiation sites with non-AUG start codons, which exactly conformed to Kozak contexts of annotated initiation sites. It is also worth noting that our approach can detect translated amino acid sequences as well as provide evidence for UTR translation, while ribosome profiling provides only the translation evidence. For the previously reported stop codon readthrough in the MDH1 gene, we could confirm the amino acid inserted during the readthrough. Data are available via ProteomeXchange with identifier PXD016207.
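
The idea of enumerating hypothetical translations from near-cognate start codons can be sketched as follows. This is an illustrative sketch only, not the authors' tUTR construction: it takes "near-cognate" to mean a codon differing from ATG at exactly one position, translates the start codon literally (a simplification), and does not model stop codon readthrough. All function names are hypothetical.

```python
# Standard genetic code, built in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def is_near_cognate(codon):
    """True if the codon differs from ATG at exactly one position."""
    return len(codon) == 3 and sum(x != y for x, y in zip(codon, "ATG")) == 1


def translate_from_near_cognate(utr):
    """Return (start_index, peptide) for every near-cognate start in the UTR.

    Translation stops at the first in-frame stop codon or at the end of the
    sequence; unknown codons are written as 'X'.
    """
    utr = utr.upper().replace("U", "T")
    results = []
    for start in range(len(utr) - 2):
        if not is_near_cognate(utr[start:start + 3]):
            continue
        peptide = []
        for pos in range(start, len(utr) - 2, 3):
            aa = CODON_TABLE.get(utr[pos:pos + 3], "X")
            if aa == "*":
                break
            peptide.append(aa)
        results.append((start, "".join(peptide)))
    return results
```

In a real database build, the resulting sequences would be written to FASTA and searched alongside the reference proteome.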


Subjects
Proteogenomics; Codon, Initiator; Peptides/genetics; Tandem Mass Spectrometry; Untranslated Regions
9.
J Proteome Res ; 18(10): 3800-3806, 2019 10 04.
Article in English | MEDLINE | ID: mdl-31475827

ABSTRACT

We propose to use cRFP (common Repository of FBS Proteins) in the MS (mass spectrometry) raw data search of cell secretomes. cRFP is a small supplementary sequence list of highly abundant fetal bovine serum proteins added to the reference database in use. The aim behind using cRFP is to prevent the contaminant FBS proteins from being misidentified as other proteins in the reference database, just as we would use cRAP (common Repository of Adventitious Proteins) to prevent contaminant proteins present either by accident or through unavoidable contacts from being misidentified as other proteins. We expect it to be widely used in experiments where the proteins are obtained from serum-free media after thorough washing of the cells, or from complex media such as SILAC, or from extracellular vesicles directly.


Subjects
Cells, Cultured/metabolism; Proteome/analysis; Proteomics/methods; Serum/chemistry; Animals; Cattle; Culture Media/chemistry; Databases, Protein; Humans; Mass Spectrometry
10.
Anal Chem ; 91(17): 11324-11333, 2019 09 03.
Article in English | MEDLINE | ID: mdl-31365238

ABSTRACT

Post-translational modifications regulate various cellular processes and are of great biological interest. Unrestrictive searches of mass spectrometry data enable the detection of any type of modification. Here we propose MODplus, which makes practical unrestrictive searches possible by allowing (1) hundreds of modifications, (2) multiple modifications per peptide, (3) the whole proteome database, and (4) any tolerance values in search parameters. The utility of MODplus was demonstrated in large human data sets of HEK293 cells and TMT-labeled phosphorylation enrichment. Notably, MODplus supports identifying different modification types at multiple sites and reports real chemical and biological modifications, as it has been very labor intensive to link unrestrictive search results to real modifications. We also confirmed the presence of Missing Precursor (MP) spectra that were not identifiable using targeted precursor masses. The MP spectra mostly resulted in identifications of wrong modifications and negatively affected the overall performance, often by as much as 10%. MODplus can rapidly recognize MP spectra and correct their identifications, resulting in an increased identification rate of up to 70% in the HEK293 data set as well as improved reliability.


Subjects
Mass Spectrometry/methods; Protein Processing, Post-Translational; Software; Databases, Protein; Datasets as Topic/standards; HEK293 Cells; Humans; Proteomics/methods; Reproducibility of Results; Scientific Experimental Error
11.
Sci Rep ; 9(1): 3176, 2019 02 28.
Article in English | MEDLINE | ID: mdl-30816214

ABSTRACT

Characterization of protein structural changes in response to protein modifications, ligand or chemical binding, or protein-protein interactions is essential for understanding protein function and its regulation. Amide hydrogen/deuterium exchange (HDX) coupled with mass spectrometry (MS) is one of the most favorable tools for characterizing protein dynamics and changes of protein conformation. However, the analysis of HDX-MS data has not yet reached its full potential because it still requires manual validation by mass spectrometry experts. In particular, with the advent of high-throughput technologies, data sizes grow every day and an automated tool is essential for the analysis. Here, we introduce a fully automated software tool, referred to as 'deMix', for HDX-MS data analysis. deMix works directly with the deuterated isotopic distributions rather than their centroid masses and is designed to be robust to random noise. In addition, unlike the existing approaches that can only determine a single state from an isotopic distribution, deMix can also detect a bimodal deuterated distribution, arising from EX1 behavior or heterogeneous peptides in conformational isomer proteins. Furthermore, deMix comes with visualization software to facilitate validation and representation of the analysis results.


Subjects
Hydrogen Deuterium Exchange-Mass Spectrometry/methods; Proteins/ultrastructure; Software; Protein Conformation; Proteins/chemistry
12.
Mol Cell Proteomics ; 16(12): 2111-2124, 2017 Dec.
Article in English | MEDLINE | ID: mdl-29046389

ABSTRACT

Immunotherapy is becoming increasingly important in the fight against cancers, using and manipulating the body's immune response to treat tumors. Understanding the immune repertoire (the collection of immunological proteins) of treated and untreated cells is possible at the genomic level, but technically difficult at the protein level. Standard protein databases do not include the highly divergent sequences of somatically rearranged immunoglobulin genes, which may lead to missed identifications in a mass spectrometry search. We introduce a novel proteogenomic approach, AbScan, to identify these highly variable antibody peptides, by developing a customized antibody database construction method using RNA-seq reads aligned to immunoglobulin (Ig) genes. AbScan starts by filtering transcript (RNA-seq) reads that match the template for Ig genes. The retained reads are used to construct a repertoire graph using the "split" de Bruijn graph: a graph structure that improves on the standard de Bruijn graph to capture the high diversity of Ig genes in a compact manner. AbScan corrects for sequencing errors, and converts the graph to a format suitable for searching with MS/MS search tools. We used AbScan to create an antibody database from 90 RNA-seq colorectal tumor samples. Next, we used proteogenomic analysis to search MS/MS spectra of matched colorectal samples from the Clinical Proteomic Tumor Analysis Consortium (CPTAC) against the AbScan-generated database. AbScan identified 1,940 distinct antibody peptides. Correlating with previously identified Single Amino-Acid Variants (SAAVs) in the tumor samples, we identified 163 pairs (antibody peptide, SAAV) with a significant co-occurrence pattern in the 90 samples. The presence of coexpressed antibody and mutated peptides was correlated with survival time of the individuals. Our results suggest that AbScan (https://github.com/csw407/AbScan.git) is an effective tool for a proteomic exploration of the immune response in cancers.
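
For orientation, the standard de Bruijn graph that the abstract takes as its starting point can be sketched as follows. This is the textbook construction only, not AbScan's "split" variant, and the function name and the dictionary-of-lists output are assumptions for this example.

```python
from collections import defaultdict


def de_bruijn_graph(reads, k):
    """Build a standard de Bruijn graph from reads.

    Nodes are (k-1)-mers; each k-mer in a read adds an edge from its
    prefix (first k-1 characters) to its suffix (last k-1 characters).
    """
    graph = defaultdict(set)
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            graph[kmer[:-1]].add(kmer[1:])
    # Sort successors so the output is deterministic.
    return {node: sorted(nexts) for node, nexts in graph.items()}
```

Two reads that differ at a single position share most of their path and diverge at one node, which is how such a graph stores many sequence variants compactly.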


Subjects
Colorectal Neoplasms/immunology; Genomics/methods; Immunoglobulins/chemistry; Peptides/genetics; Proteomics/methods; Algorithms; Cell Line, Tumor; Colorectal Neoplasms/genetics; Databases, Genetic; Databases, Protein; Humans; Immunoglobulins/genetics; Peptides/chemistry; Sequence Analysis, RNA; Tandem Mass Spectrometry
13.
Mol Cell Proteomics ; 15(11): 3501-3512, 2016 11.
Article in English | MEDLINE | ID: mdl-27609420

ABSTRACT

Peptide and protein identification remains challenging in organisms with poorly annotated or rapidly evolving genomes, as are commonly encountered in environmental or biofuels research. Such limitations render tandem mass spectrometry (MS/MS) database search algorithms ineffective as they lack corresponding sequences required for peptide-spectrum matching. We address this challenge with the spectral networks approach to (1) match spectra of orthologous peptides across multiple related species and then (2) propagate peptide annotations from identified to unidentified spectra. We here present algorithms to assess the statistical significance of spectral alignments (Align-GF), reduce the impurity in spectral networks, and accurately estimate the error rate in propagated identifications. Analyzing three related Cyanothece species, a model organism for biohydrogen production, spectral networks identified peptides from highly divergent sequences from networks with dozens of variant peptides, including thousands of peptides in species lacking a sequenced genome. Our analysis further detected the presence of many novel putative peptides even in genomically characterized species, thus suggesting the possibility of gaps in our understanding of their proteomic and genomic expression. A web-based pipeline for spectral networks analysis is available at http://proteomics.ucsd.edu/software.


Subjects
Cyanothece/metabolism; Peptides/analysis; Proteomics/methods; Algorithms; Bacterial Proteins/metabolism; Cluster Analysis; Cyanothece/classification; Databases, Protein; Genome, Bacterial; Sequence Analysis, Protein; Software; Tandem Mass Spectrometry/methods
14.
J Proteome Res ; 14(9): 3555-67, 2015 Sep 04.
Article in English | MEDLINE | ID: mdl-26139413

ABSTRACT

Aiming toward an improved understanding of the regulation of proteins in cancer, recent studies from the Clinical Proteomic Tumor Analysis Consortium (CPTAC) have focused on analyzing cancer tissue using proteomic technologies and workflows. Although many proteogenomics approaches for the study of cancer samples have been proposed, serious methodological challenges remain, especially in the identification of multiple mutational variants or structural variations such as fusion gene events. In addition, although immune system genes play an important role in cancer, identification of IgG peptides remains challenging in proteomic data sets. Here, we describe an integrative proteogenomic method that extends the limit of proteogenomic searches to identify multiple variant peptides as well as immunoglobulin gene variations/rearrangements using customized mining of RNA-seq data. Our results also provide the first extensive characterization of tumor immune response and demonstrate the potential of this method to improve the molecular characterization of tumor subtypes.


Subjects
Genomics; Immunoglobulins/chemistry; Mutation; Peptides/genetics; Proteomics; Alternative Splicing; Amino Acid Sequence; Databases, Protein; Humans; Molecular Sequence Data; Peptides/chemistry; Tandem Mass Spectrometry
15.
Mol Biosyst ; 11(4): 1156-64, 2015 Apr.
Article in English | MEDLINE | ID: mdl-25703060

ABSTRACT

The identification of disulfide bonds provides critical information regarding the structure and function of a protein and is a key aspect in understanding signaling cascades in biological systems. Recent proteomic approaches using digestion enzymes have facilitated the characterization of disulfide-bonds and/or oxidized products from cysteine residues, although these methods have limitations in the application of MS/MS. For example, protein digestion to obtain the native form of disulfide bonds results in short lengths of amino acids, which can cause ambiguous MS/MS analysis due to false positive identifications. In this study we propose a new approach, termed planned digestion, to obtain sufficient amino acid lengths after cleavage for proteomic approaches. Application of the DBond software to planned digestion of specific proteins accurately identified disulfide-linked peptides. RNase A was used as a model protein in this study because the disulfide bonds of this protein have been well characterized. Application of this approach to peptides digested with Asp-N/C (chemical digestion) and trypsin under acid hydrolysis conditions identified the four native disulfide bonds of RNase A. Missed cleavages introduced by trypsin treatment for only 3 hours generated sufficient lengths of amino acids for identification of the disulfide bonds. Analysis using MS/MS successfully showed additional fragmentation patterns that are cleavage products of S-S and C-S bonds of disulfide-linkage peptides. These fragmentation patterns generate thioaldehydes, persulfide, and dehydroalanine. This approach of planned digestion with missed cleavages using the DBond algorithm could be applied to other proteins to determine their disulfide linkage and the oxidation patterns of cysteine residues.


Subjects
Disulfides/chemistry; Peptide Fragments/chemistry; Proteins/chemistry; Proteomics/methods; Sequence Analysis, Protein/methods; Tandem Mass Spectrometry/methods; Amino Acid Sequence; Disulfides/analysis; Molecular Sequence Data; Peptide Fragments/analysis; Peptide Fragments/metabolism; Proteins/analysis; Trypsin/metabolism
16.
Mass Spectrom Rev ; 34(2): 133-47, 2015.
Article in English | MEDLINE | ID: mdl-24889695

ABSTRACT

Post-translational modifications (PTMs) are critical to almost all aspects of complex processes of the cell. Identification of PTMs is one of the biggest challenges for proteomics, and there have been many computational studies for the analysis of PTMs from tandem mass spectrometry (MS/MS). Most early PTM identification studies have been performed by matching MS/MS data to protein databases, using database search tools, but they are prohibitively slow when a large number of PTMs is given as a search parameter. In this article, we present recent developments to search for more types of PTMs and to speed up the search, and discuss many computational issues and solutions in terms of identifying multiply modified peptides or searching for all possible modifications at once in unrestrictive mode. Apart from the most common type of PTMs involving covalent addition of functional groups to proteins, PTMs such as disulfide linkage require dedicated software for the analysis because they may involve cross-linking between two different parts of proteins. Finally, methods for identification of protein disulfide bonds are presented.


Subjects
Disulfides/analysis; Peptide Fragments/analysis; Protein Processing, Post-Translational; Proteins/metabolism; Software; Algorithms; Amino Acid Sequence; Databases, Protein; Disulfides/chemistry; Molecular Sequence Data; Oxidation-Reduction; Proteins/chemistry; Proteomics/instrumentation; Proteomics/methods; Tandem Mass Spectrometry
17.
Proteomics ; 14(23-24): 2719-30, 2014 Dec.
Article in English | MEDLINE | ID: mdl-25263569

ABSTRACT

Cancer is driven by the acquisition of somatic DNA lesions. Distinguishing the early driver mutations from subsequent passenger mutations is key to molecular subtyping of cancers, understanding cancer progression, and the discovery of novel biomarkers. The advances of genomics technologies (whole-genome, exome, and transcript sequencing, collectively referred to as next-generation sequencing (NGS)) have fueled recent studies on somatic mutation discovery. However, the vision is challenged by the complexity, redundancy, and errors in genomic data, and the difficulty of investigating the proteome-translated portion of aberrant genes using only genomic approaches. Combinations of proteomic and genomic technologies are increasingly being employed. Various strategies have been employed to allow the usage of large-scale NGS data for conventional MS/MS searches. This paper provides a discussion of applying different strategies relating to large database search and false discovery rate (FDR)-based error control, and their implications for cancer proteogenomics. Moreover, it extends and develops the idea of a unified genomic variant database that can be searched by any MS sample. A total of 879 BAM files downloaded from the TCGA repository were used to create a 4.34 GB unified FASTA database that contained 2,787,062 novel splice junctions, 38,464 deletions, 1,105 insertions, and 182,302 substitutions. Proteomic data from a single ovarian carcinoma sample (439,858 spectra) were searched against the database. By applying the most conservative FDR measure, we have identified 524 novel peptides and 65,578 known peptides at a 1% FDR threshold. The novel peptides include interesting examples of doubly mutated peptides, frame-shifts, and nonsample-recruited mutations, which emphasize the strength of our approach.


Subjects
High-Throughput Nucleotide Sequencing/methods; Neoplasms/metabolism; Proteomics/methods; Databases, Protein; Humans; Neoplasms/genetics; Peptides/genetics
18.
PLoS One ; 8(12): e81734, 2013.
Article in English | MEDLINE | ID: mdl-24312579

ABSTRACT

Twenty different aminoacyl-tRNA synthetases (ARSs) link amino acids to their cognate tRNAs. Individual ARSs are also associated with various non-canonical activities involved in neuronal diseases, cancer and autoimmune diseases. Among them, eight ARSs (D, EP, I, K, L, M, Q and RARS), together with three ARS-interacting multifunctional proteins (AIMPs), are currently known to assemble the multi-synthetase complex (MSC). However, the cellular function and global topology of MSC remain unclear. In order to understand the complex interaction within MSC, we conducted affinity purification-mass spectrometry (AP-MS) using each of AIMP1, AIMP2 and KARS as a bait protein. Mass spectrometric data were funneled into SAINT software to distinguish true interactions from background contaminants. A total of 40, 134, and 101 proteins for the respective baits scored over 0.9 in SAINT probability in HEK 293T cells. Complex-forming ARSs, such as DARS, EPRS, IARS, KARS, LARS, MARS, QARS and RARS, were consistently found to interact with each bait. Variants such as AIMP2-DX2 and AIMP1 isoform 2 were found with specific peptides in KARS precipitates. Relative enrichment analysis of the mass spectrometric data demonstrated that TARSL2 (threonyl-tRNA synthetase like-2) was highly enriched with the ARS core complex. The interaction was further confirmed by coimmunoprecipitation of TARSL2 with other ARS core-complex components. We suggest TARSL2 as a new component of the ARS core complex.


Subjects
Amino Acyl-tRNA Synthetases/chemistry; Amino Acyl-tRNA Synthetases/metabolism; Chromatography, Affinity; Computational Biology/methods; Mass Spectrometry; Protein Interaction Mapping/methods; Threonine-tRNA Ligase/analysis; Threonine-tRNA Ligase/metabolism; Algorithms; Amino Acid Sequence; Carrier Proteins/chemistry; Carrier Proteins/metabolism; Cytokines/chemistry; Cytokines/metabolism; HEK293 Cells; Humans; Lysine-tRNA Ligase/metabolism; Molecular Sequence Data; Neoplasm Proteins/chemistry; Neoplasm Proteins/metabolism; Nuclear Proteins; Protein Processing, Post-Translational; RNA-Binding Proteins/chemistry; RNA-Binding Proteins/metabolism; Threonine-tRNA Ligase/isolation & purification
19.
J Proteome Res ; 11(9): 4488-98, 2012 Sep 07.
Article in English | MEDLINE | ID: mdl-22779694

ABSTRACT

Selenoproteins, containing selenocysteine (Sec, U) as the 21st amino acid in the genetic code, are well conserved from bacteria to human, except yeast and higher plants that lack the Sec insertion machinery. Determination of Sec association is important to find substrates and to understand redox action of selenoproteins. While mass spectrometry (MS) has become a common and powerful tool to determine the amino acid sequence of a protein, identification of a protein sequence containing Sec has not been easy using MS because of the limited stability of Sec in selenoproteins. Se has six naturally occurring isotopes, 74Se, 76Se, 77Se, 78Se, 80Se, and 82Se, and 80Se is the most abundant isotope. These characteristics provide a good indicator for selenopeptides but make it difficult to detect selenopeptides using software analysis tools developed for common peptides. Thus, previous reports verified MS scans of selenopeptides by manual inspection. None of the fully automated algorithms have taken into account the isotopes of Se, leading to the wrong interpretation for selenopeptides. In this paper, we present an algorithm to determine monoisotopic masses of selenocysteine-containing polypeptides. Our algorithm is based on a theoretical model for an isotopic distribution of a selenopeptide, which regards peak intensities in an isotopic distribution as the natural abundances of C, H, N, O, S, and Se. Our algorithm uses two kinds of isotopic peak intensity ratios: one for two adjacent peaks and another for two distant peaks. It is shown that our algorithm for selenopeptides performs accurately, which was demonstrated with two LC-MS/MS data sets. Using this algorithm, we have successfully identified the Sec-Cys and Sec-Sec cross-linking of glutaredoxin 1 (GRX1) from mass spectra obtained by a UPLC-ESI-q-TOF instrument.
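
Why selenium reshapes a peptide's isotopic envelope can be sketched as follows. This is a simplified illustration, not the authors' algorithm: it works at nominal (integer) mass, models only Se and C and ignores H, N, O, and S, and uses approximate natural abundances that are assumptions for this example. All function names are hypothetical.

```python
# Approximate natural abundances (assumed values for illustration).
SE_ISOTOPES = {74: 0.0089, 76: 0.0937, 77: 0.0763,
               78: 0.2377, 80: 0.4961, 82: 0.0873}
C13_ABUNDANCE = 0.0107


def convolve(p, q):
    """Discrete convolution of two intensity lists."""
    out = [0.0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out


def selenium_pattern():
    """Se intensities indexed by nominal-mass offset from 74Se."""
    lightest = min(SE_ISOTOPES)
    pattern = [0.0] * (max(SE_ISOTOPES) - lightest + 1)
    for mass, abundance in SE_ISOTOPES.items():
        pattern[mass - lightest] = abundance
    return pattern


def carbon_pattern(n_carbons):
    """Distribution of the number of 13C atoms among n_carbons carbons."""
    pattern = [1.0]
    for _ in range(n_carbons):
        pattern = convolve(pattern, [1 - C13_ABUNDANCE, C13_ABUNDANCE])
    return pattern


def selenopeptide_pattern(n_carbons):
    """Combined pattern for one Se atom plus n_carbons carbon atoms."""
    return convolve(selenium_pattern(), carbon_pattern(n_carbons))
```

In this simplified model the tallest peak sits six nominal mass units above the lightest one, so software that assumes the first or tallest peak is monoisotopic would misassign the mass; ratios between adjacent and distant peaks of such a pattern are the kind of signal the abstract describes.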


Subjects
Algorithms; Mass Spectrometry/methods; Models, Chemical; Peptides/chemistry; Selenocysteine/chemistry; Selenoproteins/chemistry; Amino Acid Sequence; Isotopes/chemistry; Molecular Sequence Data
20.
Mol Cell Proteomics ; 11(4): M111.010199, 2012 Apr.
Article in English | MEDLINE | ID: mdl-22186716

ABSTRACT

With great biological interest in post-translational modifications (PTMs), various approaches have been introduced to identify PTMs using MS/MS. Recent developments for PTM identification have focused on an unrestrictive approach that searches MS/MS spectra for all known and possibly even unknown types of PTMs at once. However, the resulting expanded search space requires much longer search time and also increases the number of false positives (incorrect identifications) and false negatives (missed true identifications), thus creating a bottleneck in high throughput analysis. Here we introduce MODa, a novel "multi-blind" spectral alignment algorithm that allows for fast unrestrictive PTM searches with no limitation on the number of modifications per peptide while featuring over an order of magnitude speedup in relation to existing approaches. We demonstrate the sensitivity of MODa on human shotgun proteomics data where it reveals multiple mutations, a wide range of modifications (including glycosylation), and evidence for several putative novel modifications. Based on the reported findings, we argue that the efficiency and sensitivity of MODa make it the first unrestrictive search tool with the potential to fully replace conventional restrictive identification of proteomics mass spectrometry data.


Subjects
Algorithms; Protein Processing, Post-Translational; Proteins/metabolism; Proteomics/methods; Databases, Protein; HEK293 Cells; Humans; Lens, Crystalline/metabolism; Mutation; Proteins/genetics; Proteome; Tandem Mass Spectrometry